63 research outputs found
A Super-Fast Distributed Algorithm for Bipartite Metric Facility Location
The \textit{facility location} problem consists of a set of \textit{facilities}, a set of \textit{clients}, an \textit{opening cost} associated with each facility, and a \textit{connection cost} between each facility and client. The goal is to find a subset of facilities to \textit{open}, and to connect each client to an open facility, so as to minimize the total facility opening costs plus connection costs. This paper presents the first expected-sub-logarithmic-round distributed O(1)-approximation algorithm in the CONGEST model for the \textit{metric} facility location problem on a complete bipartite network. This result can be viewed as a continuation of our recent work (ICALP 2012), in which we presented the first sub-logarithmic-round distributed O(1)-approximation algorithm for metric facility location on a \textit{clique} network. The bipartite setting presents several new challenges not present in the clique setting, and we present two new techniques to overcome them. (i) To deal with the problem of not being able to choose appropriate probabilities (due to lack of adequate knowledge), we design an algorithm that performs a random walk over a probability space and analyze the progress the algorithm makes as the random walk proceeds. (ii) To deal with the problem of quickly disseminating a collection of messages, possibly containing many duplicates, over the bipartite network, we design a probabilistic hashing scheme that delivers all of the messages within a small expected number of rounds.
Comment: 22 pages. This is the full version of a paper that appeared in DISC 201
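The duplicate-filtering idea behind a hash-based dissemination scheme can be sketched in a centralized toy model: hashing a message ID picks a deterministic relay, so all copies of a message collide at one place and are forwarded once. The `relay_for` and `disseminate` helpers below are illustrative assumptions, not the paper's randomized scheme or its round analysis.

```python
import hashlib

def relay_for(msg_id, num_relays):
    # Hash the message ID so every copy of the same message is routed
    # to the same relay, where duplicates can be filtered.
    digest = hashlib.sha256(msg_id.encode()).hexdigest()
    return int(digest, 16) % num_relays

def disseminate(messages, num_relays):
    # messages: iterable of (msg_id, payload), possibly with duplicates.
    buckets = {r: {} for r in range(num_relays)}
    for msg_id, payload in messages:
        buckets[relay_for(msg_id, num_relays)][msg_id] = payload
    # Each relay forwards its deduplicated bucket exactly once.
    delivered = {}
    for bucket in buckets.values():
        delivered.update(bucket)
    return delivered

msgs = [("a", 1), ("a", 1), ("b", 2), ("a", 1), ("c", 3)]
print(disseminate(msgs, 4))  # three distinct messages survive
```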
Lessons from the Congested Clique Applied to MapReduce
The main results of this paper are (I) a simulation algorithm which, under quite general constraints, transforms algorithms running on the Congested Clique into algorithms running in the MapReduce model, and (II) a distributed coloring algorithm running on the Congested Clique whose expected running time falls into two regimes depending on the parameters. Applying the simulation theorem to this Congested-Clique coloring algorithm yields a corresponding coloring algorithm in the MapReduce model.
Our simulation algorithm illustrates a natural correspondence between per-node bandwidth in the Congested Clique model and memory per machine in the MapReduce model. In the Congested Clique (and, more generally, any network in the CONGEST model), the major impediment to constructing fast algorithms is the restriction on message sizes. Similarly, in the MapReduce model, the combined restrictions on memory per machine and total system memory have a dominant effect on algorithm design. In showing a fairly general simulation algorithm, we highlight the similarities and differences between these models.
Comment: 15 pages
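The bandwidth/memory correspondence can be illustrated by expressing one Congested Clique round as a map step keyed by destination plus a reduce step that groups each node's inbox. `map_phase` and `reduce_phase` are hypothetical names, and the real MapReduce constraints on per-machine and total memory are not modeled here.

```python
from collections import defaultdict

def map_phase(outbox):
    # Map step: emit every outgoing message keyed by its destination,
    # mirroring the per-node bandwidth limit of the Congested Clique.
    for src, msgs in outbox.items():
        for dst, payload in msgs:
            yield dst, (src, payload)

def reduce_phase(keyed):
    # Reduce step: group messages by destination, so the machine
    # simulating node `dst` receives exactly that node's inbox.
    inbox = defaultdict(list)
    for dst, item in keyed:
        inbox[dst].append(item)
    return dict(inbox)

# One Congested Clique round: node 0 sends to 1 and 2, node 1 sends to 2.
outbox = {0: [(1, "x"), (2, "y")], 1: [(2, "z")], 2: []}
print(reduce_phase(map_phase(outbox)))  # {1: [(0, 'x')], 2: [(0, 'y'), (1, 'z')]}
```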
High Entropy Random Selection Protocols
In this paper, we construct protocols for two parties that do not trust each other to generate random variables with high Shannon entropy. We improve known bounds for the trade-off between the number of rounds, the length of communication, and the entropy of the outcome.
Almost-Tight Distributed Minimum Cut Algorithms
We study the problem of computing the minimum cut in weighted distributed message-passing networks (the CONGEST model). Our algorithm can compute the minimum cut exactly in sublinear time. To the best of our knowledge, this is the first paper that explicitly studies computing the exact minimum cut in the distributed setting. Previously, non-trivial sublinear-time algorithms for this problem were known only for unweighted graphs with small minimum cut, due to Pritchard and Thurimella's algorithms for computing 2-edge-connected and 3-edge-connected components.
By using Karger's edge sampling technique, we can convert this algorithm into a sublinear-time approximation algorithm. This improves over the previous approximation algorithms of Ghaffari and Kuhn. Due to the lower bound of Das Sarma et al., which holds for any approximation algorithm, this running time is tight up to lower-order factors.
To obtain the stated running time, we developed an approximation algorithm which combines the ideas of Thorup's algorithm and Matula's contraction algorithm, avoiding the overhead of applying Thorup's tree packing theorem directly. We then combine Kutten and Peleg's tree partitioning algorithm with Karger's dynamic programming to obtain an efficient distributed algorithm that finds the minimum cut when given a spanning tree that crosses the minimum cut exactly once.
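The contraction idea referenced above can be illustrated with a tiny sequential sketch of Karger's randomized contraction: merging random edge endpoints until two super-nodes remain yields the minimum cut with non-trivial probability per trial. This is a textbook toy, not the paper's distributed algorithm.

```python
import random

def karger_min_cut(edges, n, trials=200, seed=0):
    # Karger's contraction: repeatedly merge the endpoints of a random
    # edge until two super-nodes remain; the surviving crossing edges
    # form a cut. Each trial returns the minimum cut with probability
    # at least 2 / (n * (n - 1)), so repetition succeeds w.h.p.
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(trials):
        parent = list(range(n))

        def find(u):
            while parent[u] != u:
                parent[u] = parent[parent[u]]  # path halving
                u = parent[u]
            return u

        remaining = n
        while remaining > 2:
            u, v = edges[rng.randrange(len(edges))]
            ru, rv = find(u), find(v)
            if ru != rv:              # skip self-loops of contracted nodes
                parent[ru] = rv
                remaining -= 1
        best = min(best, sum(1 for u, v in edges if find(u) != find(v)))
    return best

# Two triangles joined by a single bridge edge: the minimum cut is 1.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(karger_min_cut(edges, 6))  # almost surely 1
```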
Distributed Testing of Excluded Subgraphs
We study property testing in the context of distributed computing, under the
classical CONGEST model. It is known that testing whether a graph is
triangle-free can be done in a constant number of rounds, where the constant
depends on how far the input graph is from being triangle-free. We show that,
for every connected 4-node graph H, testing whether a graph is H-free can be
done in a constant number of rounds too. The constant also depends on how far
the input graph is from being H-free, and the dependence is identical to the
one in the case of testing triangles. Hence, in particular, testing whether a
graph is K_4-free, and testing whether a graph is C_4-free can be done in a
constant number of rounds (where K_k denotes the k-node clique, and C_k denotes
the k-node cycle). On the other hand, we show that testing K_k-freeness and
C_k-freeness for k>4 appear to be much harder. Specifically, we investigate two
natural types of generic algorithms for testing H-freeness, called DFS tester
and BFS tester. The latter captures the previously known algorithm to test the
presence of triangles, while the former captures our generic algorithm to test
the presence of a 4-node graph pattern H. We prove that both DFS and BFS
testers fail to test K_k-freeness and C_k-freeness in a constant number of
rounds for k>4.
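The edge-sampling flavor of the triangle tester can be sketched sequentially: pick a random edge, then probe one breadth-first step for a common neighbor of its endpoints. `triangle_tester` is an illustrative name and the centralized loop stands in for parallel probes; the actual distributed testers and their soundness guarantees are in the paper.

```python
import random

def triangle_tester(adj, trials=100, seed=1):
    # Toy edge-sampling tester: pick a random edge, then probe for a
    # common neighbor of its endpoints (one breadth-first step).
    # One-sided error: triangle-free graphs are always accepted.
    rng = random.Random(seed)
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    for _ in range(trials):
        u, v = edges[rng.randrange(len(edges))]
        if adj[u] & adj[v]:           # a common neighbor closes a triangle
            return "reject"
    return "accept"

triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {0: {1}, 1: {0, 2}, 2: {1}}
print(triangle_tester(triangle), triangle_tester(path))  # reject accept
```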
High Entropy Random Selection Protocols
We study the two-party problem of randomly selecting a common string among all the strings of length n. We want the protocol to have the property that the output distribution has high Shannon entropy or high min-entropy, even when one of the two parties is dishonest and deviates from the protocol. We develop protocols that achieve Shannon entropy close to n and, simultaneously, min-entropy close to n/2. In the literature the randomness guarantee is usually expressed in terms of "resilience". The notion of Shannon entropy is not directly comparable to that of resilience, but we establish a connection between the two that allows us to compare our protocols with the existing ones. We construct an explicit protocol that yields Shannon entropy n - O(1) and has O(log* n) rounds, improving over the protocol of Goldreich et al. (SIAM J. Comput. 27:506-544, 1998) that also achieves this entropy but needs O(n) rounds. Both these protocols need O(n^2) bits of communication. Next we reduce the number of rounds and the length of communication in our protocols. We show the existence, non-explicitly, of a protocol that has 6 rounds, O(n) bits of communication, and yields Shannon entropy n - O(log n) and min-entropy n/2 - O(log n). Our protocol achieves the same Shannon entropy bound as the (also non-explicit) protocol of Gradwohl et al. (in: Dwork (ed) Advances in Cryptology, CRYPTO '06, 409-426, 2006), but achieves much higher min-entropy: n/2 - O(log n) versus O(log n). Finally we exhibit a very simple 3-round explicit "geometric" protocol with communication length O(n). We connect the security parameter of this protocol with the well-studied Kakeya problem.
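The selection setting can be illustrated with a minimal commit-then-reveal XOR sketch: if at least one party samples uniformly (and, ignoring aborts, the other is bound by the commitment), the output is uniform. SHA-256 is assumed to give a hiding/binding commitment, `select_string` is a hypothetical helper, and this toy is not one of the paper's protocols.

```python
import hashlib
import secrets

def commit(value: bytes, key: bytes) -> str:
    # Hash-based commitment; hiding/binding are assumed from SHA-256.
    return hashlib.sha256(key + value).hexdigest()

def select_string(nbits=16, alice_bits=None, bob_bits=None):
    # Toy 3-round commit-then-reveal selection: the output is a XOR b.
    # If at least one party samples uniformly (and nobody aborts), the
    # output is uniform over nbits-bit strings: full Shannon entropy.
    nbytes = nbits // 8
    a = alice_bits if alice_bits is not None else secrets.token_bytes(nbytes)
    key = secrets.token_bytes(16)
    c = commit(a, key)                        # round 1: Alice commits to a
    b = bob_bits if bob_bits is not None else secrets.token_bytes(nbytes)
    # round 2: Bob sends b; round 3: Alice opens the commitment.
    assert commit(a, key) == c
    return bytes(x ^ y for x, y in zip(a, b))

# Even if Bob always answers with zero bytes, the output is still
# uniform, because Alice committed to a uniform string beforehand.
print(select_string(16, bob_bits=b"\x00\x00").hex())
```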
Bipartite graph matching computation on GPU
The Bipartite Graph Matching Problem is a well-studied topic in graph theory. Such a matching relates pairs of nodes from two distinct sets by selecting a subset of the graph edges connecting them, such that no two selected edges share an endpoint. When the considered graph has huge sets of nodes and edges, sequential approaches are impractical, especially for applications demanding fast results. In this paper we investigate how to compute such a matching on Graphics Processing Units (GPUs), motivated by their increasing processing power and decreasing cost. We present a new data-parallel approach for computing bipartite graph matching that is efficiently computed on today's graphics hardware, and apply it to solve the correspondence between 3D samples taken over a time interval.
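As a sequential baseline for the matching being computed, here is a compact sketch of Kuhn's classic augmenting-path algorithm for maximum bipartite matching; it is a standard reference implementation, not the paper's data-parallel GPU method.

```python
def max_bipartite_matching(adj, n_right):
    # Kuhn's augmenting-path algorithm: for each left node, search for
    # an augmenting path with a DFS that may re-route earlier matches.
    match_right = [-1] * n_right      # right node -> matched left node

    def try_augment(u, seen):
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if match_right[v] == -1 or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    size = 0
    for u in range(len(adj)):
        if try_augment(u, set()):
            size += 1
    return size, match_right

# Left nodes 0..2, right nodes 0..2; a perfect matching exists.
adj = [[0, 1], [0], [1, 2]]
print(max_bipartite_matching(adj, 3))  # (3, [1, 0, 2])
```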